AITopics

Country:

North America > United States (0.46)
Europe (0.46)
Asia (0.28)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Security & Privacy (0.47)
Information Technology > Artificial Intelligence > Machine Learning (0.34)

Neural Information Processing SystemsSep-30-2025, 10:22:40 GMT

Extremal Mechanisms for Local Differential Privacy

Local differential privacy has recently surfaced as a strong measure of privacy in contexts where personal information remains private even from data analysts. Working in a setting where the data providers and data analysts want to maximize the utility of statistical inferences performed on the released data, we study the fundamental tradeoff between local differential privacy and information theoretic utility functions. We introduce a family of extremal privatization mechanisms, which we call staircase mechanisms, and prove that it contains the optimal privatization mechanism that maximizes utility. We further show that for all information theoretic utility functions studied in this paper, maximizing utility is equivalent to solving a linear program, the outcome of which is the optimal staircase mechanism. However, solving this linear program can be computationally expensive since it has a number of variables that is exponential in the data size. To account for this, we show that two simple staircase mechanisms, the binary and randomized response mechanisms, are universally optimal in the high and low privacy regimes, respectively, and well approximate the intermediate regime.

extremal mechanism, local differential privacy, mechanism, (8 more...)

Technology: Information Technology > Artificial Intelligence (0.64)

Neural Information Processing SystemsAug-16-2025, 14:42:20 GMT

Truthful Data Acquisition via Peer Prediction

Depending on the data distribution, the mechanisms can also discourage mis-reports that would lead to inaccurate predictions.

data provider, dataset, mechanism, (16 more...)

Country:

North America > United States (0.14)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.70)

arXiv.org Artificial IntelligenceMar-26-2025

Dynamic Learning and Productivity for Data Analysts: A Bayesian Hidden Markov Model Perspective

Yin, Yue

Data analysts are essential in organizations, transforming raw data into insights that drive decision-making and strategy. This study explores how analysts' productivity evolves on a collaborative platform, focusing on two key learning activities: writing queries and viewing peer queries. While traditional research often assumes static models, where performance improves steadily with cumulative learning, such models fail to capture the dynamic nature of real-world learning. To address this, we propose a Hidden Markov Model (HMM) that tracks how analysts transition between distinct learning states based on their participation in these activities. Using an industry dataset with 2,001 analysts and 79,797 queries, this study identifies three learning states: novice, intermediate, and advanced. Productivity increases as analysts advance to higher states, reflecting the cumulative benefits of learning. Writing queries benefits analysts across all states, with the largest gains observed for novices. Viewing peer queries supports novices but may hinder analysts in higher states due to cognitive overload or inefficiencies. Transitions between states are also uneven, with progression from intermediate to advanced being particularly challenging. This study advances understanding of into dynamic learning behavior of knowledge worker and offers practical implications for designing systems, optimizing training, enabling personalized learning, and fostering effective knowledge sharing.

artificial intelligence, machine learning, query, (16 more...)

2503.20233

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Peter Kairouz, Sewoong Oh, Pramod Viswanath

Extremal Mechanisms for Local Differential Privacy

Neural Information Processing SystemsFeb-9-2025, 11:20:35 GMT

Local differential privacy has recently surfaced as a strong measure of privacy in contexts where personal information remains private even from data analysts. Working in a setting where the data providers and data analysts want to maximize the utility of statistical inferences performed on the released data, we study the fundamental tradeoff between local differential privacy and information theoretic utility functions. We introduce a family of extremal privatization mechanisms, which we call staircase mechanisms, and prove that it contains the optimal privatization mechanism that maximizes utility. We further show that for all information theoretic utility functions studied in this paper, maximizing utility is equivalent to solving a linear program, the outcome of which is the optimal staircase mechanism. However, solving this linear program can be computationally expensive since it has a number of variables that is exponential in the data size. To account for this, we show that two simple staircase mechanisms, the binary and randomized response mechanisms, are universally optimal in the high and low privacy regimes, respectively, and well approximate the intermediate regime.

artificial intelligence, data mining, mechanism, (18 more...)

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Asia > Middle East > Jordan (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.55)

arXiv.org Artificial IntelligenceMay-11-2024

T-curator: a trust based curation tool for LOD logs

Lanasri, Dihia

Nowadays, companies are racing towards Linked Open Data (LOD) to improve their added value, but they are ignoring their SPARQL query logs. If well curated, these logs can present an asset for decision makers. A naive and straightforward use of these logs is too risky because their provenance and quality are highly questionable. Users of these logs in a trusted way have to be assisted by providing them with in-depth knowledge of the whole LOD environment and tools to curate these logs. In this paper, we propose an interactive and intuitive trust based tool that can be used to curate these LOD logs before exploiting them. This tool is proposed to support our approach proposed in our previous work Lanasri et al. [2020].

lod log, operator, query, (15 more...)

2405.07081

Country: Africa > Middle East > Algeria > Algiers Province > Algiers (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Web > Semantic Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Neural Information Processing SystemsMar-13-2024, 10:27:55 GMT

Extremal Mechanisms for Local Differential Privacy Peter Kairouz Sewoong Oh

Local differential privacy has recently surfaced as a strong measure of privacy in contexts where personal information remains private even from data analysts. Working in a setting where the data providers and data analysts want to maximize the utility of statistical inferences performed on the released data, we study the fundamental tradeoff between local differential privacy and information theoretic utility functions. We introduce a family of extremal privatization mechanisms, which we call staircase mechanisms, and prove that it contains the optimal privatization mechanism that maximizes utility. We further show that for all information theoretic utility functions studied in this paper, maximizing utility is equivalent to solving a linear program, the outcome of which is the optimal staircase mechanism. However, solving this linear program can be computationally expensive since it has a number of variables that is exponential in the data size. To account for this, we show that two simple staircase mechanisms, the binary and randomized response mechanisms, are universally optimal in the high and low privacy regimes, respectively, and well approximate the intermediate regime.

differential privacy, mechanism, staircase mechanism, (15 more...)

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Asia > Middle East > Jordan (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.55)

arXiv.org Artificial IntelligenceFeb-15-2024

DPBalance: Efficient and Fair Privacy Budget Scheduling for Federated Learning as a Service

Liu, Yu, Wang, Zibo, Zhu, Yifei, Chen, Chen

Federated learning (FL) has emerged as a prevalent distributed machine learning scheme that enables collaborative model training without aggregating raw data. Cloud service providers further embrace Federated Learning as a Service (FLaaS), allowing data analysts to execute their FL training pipelines over differentially-protected data. Due to the intrinsic properties of differential privacy, the enforced privacy level on data blocks can be viewed as a privacy budget that requires careful scheduling to cater to diverse training pipelines. Existing privacy budget scheduling studies prioritize either efficiency or fairness individually. In this paper, we propose DPBalance, a novel privacy budget scheduling mechanism that jointly optimizes both efficiency and fairness. We first develop a comprehensive utility function incorporating data analyst-level dominant shares and FL-specific performance metrics. A sequential allocation mechanism is then designed using the Lagrange multiplier method and effective greedy heuristics. We theoretically prove that DPBalance satisfies Pareto Efficiency, Sharing Incentive, Envy-Freeness, and Weak Strategy Proofness. We also theoretically prove the existence of a fairness-efficiency tradeoff in privacy budgeting. Extensive experiments demonstrate that DPBalance outperforms state-of-the-art solutions, achieving an average efficiency improvement of $1.44\times \sim 3.49 \times$, and an average fairness improvement of $1.37\times \sim 24.32 \times$.

data analyst, fairness, pipeline, (14 more...)

2402.09715

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Cheng, Liying, Li, Xingxuan, Bing, Lidong

Is GPT-4 a Good Data Analyst?

arXiv.org Artificial IntelligenceOct-22-2023

As large language models (LLMs) have demonstrated their powerful capabilities in plenty of domains and tasks, including context understanding, code generation, language generation, data storytelling, etc., many data analysts may raise concerns if their jobs will be replaced by artificial intelligence (AI). This controversial topic has drawn great attention in public. However, we are still at a stage of divergent opinions without any definitive conclusion. Motivated by this, we raise the research question of "is GPT-4 a good data analyst?" in this work and aim to answer it by conducting head-to-head comparative studies. In detail, we regard GPT-4 as a data analyst to perform end-to-end data analysis with databases from a wide range of domains. We propose a framework to tackle the problems by carefully designing the prompts for GPT-4 to conduct experiments. We also design several task-specific evaluation metrics to systematically compare the performance between several professional human data analysts and GPT-4. Experimental results show that GPT-4 can achieve comparable performance to humans. We also provide in-depth discussions about our results to shed light on further studies before reaching the conclusion that GPT-4 can replace data analysts.

data analyst, gpt-4, information, (16 more...)

2305.15038

Country:

Asia > Singapore (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > California (0.04)
(3 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (1.00)
Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligenceApr-8-2023, 09:32:58 GMT

Data Analyst (f/m/div.) at Bosch Group - Abstatt, Germany

At Bosch, we shape the future by inventing high-quality technologies and services that spark enthusiasm and enrich people's lives. Our promise to our associates is rock-solid: We grow together, we enjoy our work, and we inspire each other. The Robert Bosch GmbH is looking forward to your application! You want to work remotely or part-time - we offer great opportunities for mobile working as well as different part-time models or job-sharing. Feel free to contact us. Diversity and inclusion are not just trends for us but are firmly anchored in our corporate culture.

application, bosch group, data analyst, (3 more...)

#artificialintelligence

Country: Europe > Germany (0.40)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.40)
Information Technology > Artificial Intelligence (0.40)